Let’s try a bootstrapped likelihood ratio test (BLRT).
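# Parametric bootstrap of the log-likelihood for each model (1000 replicates each)
b3 <- lmeresampler::bootstrap(model_3_ML, .f = logLik, type = "parametric", B = 1000)
b4 <- lmeresampler::bootstrap(model_4_ML, .f = logLik, type = "parametric", B = 1000)
# Difference in deviance (-2 * logLik) between model_4 and model_3, per replicate;
# note that the two sets of replicates come from separate bootstrap runs
lrt_b <- -2 * b4$replicates + 2 * b3$replicates
# 95% percentile interval of the bootstrapped differences
quantile(lrt_b, probs = c(.025, .975))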
## 2.5% 97.5%
## -81.85642 98.24528
The 95% CI includes 0 -> neither model fits significantly better.
-> The ICs favor model_4: it is more parsimonious, yet it doesn’t fit the data significantly worse than model_3.
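For comparison, a BLRT is often set up differently: simulate responses under the simpler model, refit both models to each simulated response, and compare the observed LRT statistic with the bootstrapped ones. The following is a minimal sketch of that idea, not the approach used above. It assumes that both models are `lme4` fits estimated with ML, that the simpler model is nested in the more complex one, and that the data contain no missing values. `blrt` is a hypothetical helper name, not a function from `lmeresampler`.

```r
library(lme4)

# Hypothetical helper: parametric bootstrap LRT for two nested lmer fits (ML).
# null_model = simpler model, alt_model = more complex model.
blrt <- function(null_model, alt_model, B = 1000, seed = 1) {
  # Observed LRT statistic
  lrt_obs <- 2 * (as.numeric(logLik(alt_model)) - as.numeric(logLik(null_model)))

  # Simulate B response vectors under the null model (one column per replicate)
  y_sim <- simulate(null_model, nsim = B, seed = seed)

  # Refit both models to every simulated response and store the LRT statistic
  lrt_boot <- vapply(y_sim, function(y) {
    ll_null <- as.numeric(logLik(refit(null_model, newresp = y)))
    ll_alt  <- as.numeric(logLik(refit(alt_model,  newresp = y)))
    2 * (ll_alt - ll_null)
  }, numeric(1))

  # Bootstrap p-value: share of simulated statistics at least as large as observed
  p_value <- (sum(lrt_boot >= lrt_obs) + 1) / (B + 1)

  list(lrt_obs = lrt_obs, lrt_boot = lrt_boot, p_value = p_value)
}
```

If model_4_ML is the random-intercept model and model_3_ML adds the random slope, the call would be `blrt(model_4_ML, model_3_ML, B = 1000)`; a large `p_value` would point to the same conclusion as the interval above.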
If your theory doesn’t emphasize the random slope, then the random intercept alone seems to be sufficient.
Always pick the model that fits your theory.
Note: Theory is fixed before the analysis.